ABSTRACT
A quantum computer directly manipulates information stored in the state of quantum mechanical systems. The
available operations have many attractive features but are also subject to severe restrictions, which complicate the
design of quantum algorithms. We present a divide-and-conquer approach to the design of various quantum
algorithms. The class of algorithms includes many transforms that are well known in classical signal processing
applications. We show how fast quantum algorithms can be derived for the discrete Fourier transform, the
Walsh-Hadamard transform, the Slant transform, and the Hartley transform. All these algorithms use at most
$O(\log^2 N)$ operations to transform a state vector of a quantum computer of length $N$.

1. INTRODUCTION

Discrete orthogonal transforms and discrete unitary transforms have found various applications in signal, image,
and video processing, in pattern recognition, in biocomputing, and in numerous other areas [1-11]. Well-known
examples of such transforms include the discrete Fourier transform, the Walsh-Hadamard transform,
the trigonometric transforms such as the Sine and Cosine transforms, the Hartley transform, and the Slant
transform. All these different transforms find applications in signal and image processing, because the great
variety of signal classes occurring in practice cannot be handled by a single transform.

On a classical computer, the straightforward way to compute a discrete orthogonal transform of a signal
vector of length $N$ takes in general $O(N^2)$ operations. An important aspect in many applications is to achieve
the best possible computational efficiency. The examples mentioned above allow an evaluation with as few as
$O(N \log N)$ operations or, in the case of the wavelet transforms, even with as few as $O(N)$ operations. In
view of the trivial lower bound of $\Omega(N)$ operations for matrix-vector products, we notice that these algorithms
are optimal or nearly optimal.

The rules of the game change dramatically when the ultimate limit of computational integration is approached,
that is, when information is stored in single atoms, photons, or other quantum mechanical systems.
The operations manipulating the state of such a computer have to follow the dictum of quantum mechanics.
However, this is not necessarily a limitation. A striking example of the potential speed-up of quantum computation
over classical computation was given by Shor in 1994. He showed that integers can be factored in
polynomial time on a quantum computer. In contrast, there are no polynomial time algorithms known for this
problem on a classical computer.

The quantum computing model does not provide a uniform speed-up for all computational tasks. In fact, 
there are a number of problems which do not allow any speed-up at all. For instance, it can be shown that a 
quantum computer searching a sorted database will not have any advantage over a classical computer. On the 
other hand, if we use our classical algorithms on a quantum computer, then it will simply perform the calculation 
in a similar manner to a classical computer. In order for a quantum computer to show its superiority one needs 
to design new algorithms which take advantage of quantum parallelism. 

A quantum algorithm may be thought of as a discrete unitary transform which is followed by some I/O
operations. This observation partially explains why signal transforms play a dominant role in numerous quantum
algorithms. Another reason is that it is often possible to find extremely efficient quantum algorithms for
the discrete orthogonal transforms mentioned above. For instance, the discrete Fourier transform of length
$N = 2^n$ can be implemented with $O(\log^2 N)$ operations on a quantum computer.

2. QUANTUM COMPUTING 

The basic unit of information in classical computation is a bit, a system with two distinguishable states
representing the logical values 0 or 1. We mentioned in the introduction that a quantum computer will store such
information in the states of a quantum mechanical system. Suppose that the system has two distinguishable
states. We will denote these states by $|0\rangle$ and $|1\rangle$, where the notation reminds us that these states represent
the logical values 0 and 1.

A potential candidate for the storage of a single bit is given by a spin-1/2 particle, such as an electron, proton,
or neutron. We can choose the state with the rotation vector pointing upward (spin-up) and the state with
the rotation vector pointing downward (spin-down) to represent 0 and 1, respectively. However, we know from
quantum mechanics that a quantum system can be in a superposition of states. In the case of a spin-1/2 particle,
a superposition

\[ |\psi\rangle = a\,|{\uparrow}\rangle + b\,|{\downarrow}\rangle \]

yields a state which rotates about a different axis. The coefficients $a, b$ in this superposition are complex
numbers, which determine this spin axis.

The consequent abstraction of the preceding example leads to the notion of a quantum bit, or qubit for short,
the basic unit of information in quantum computation. A quantum bit is given by a superposition of the states
$|0\rangle$ and $|1\rangle$ such as

\[ |\psi\rangle = a\,|0\rangle + b\,|1\rangle, \qquad a, b \in \mathbf{C}. \]

The value of a quantum bit remains uncertain until it is measured. A measurement will collapse $|\psi\rangle$ to either the
state $|0\rangle$ or to the state $|1\rangle$. The coefficients $a$ and $b$ determine the probability of the outcome of this measurement,
namely

\[ \Pr\big[\,|\psi\rangle \mapsto |0\rangle\,\big] = \frac{|a|^2}{|a|^2 + |b|^2}, \qquad \Pr\big[\,|\psi\rangle \mapsto |1\rangle\,\big] = \frac{|b|^2}{|a|^2 + |b|^2}. \]

In either case, we will learn the outcome of the measurement. Since proportional states lead to the same
measurement results, it is conventionally assumed that the state is normalized to length 1, i.e., it is assumed that
$|a|^2 + |b|^2 = 1$ holds.

The measurement allows us to implement a fair coin flip on a quantum computer. Indeed, preparing a quantum
bit in the state $|\psi\rangle = \frac{1}{\sqrt{2}}|0\rangle + \frac{1}{\sqrt{2}}|1\rangle$ and measuring the result yields either 0 or 1. According to the above
rule, either event will occur with probability exactly 1/2. This example might suggest that computations on a
quantum computer are nondeterministic and maybe even somewhat fuzzy. However, this is not the case. We will
see in a moment that all operations apart from measurements are completely deterministic. The only operations
that might introduce some randomized behaviour are the measurements, which, as Penrose puts it, 'magnify
an event from the quantum level to the classical level' [15, pp. 7-8].

We now discuss the deterministic operations on a quantum computer. We begin with the simplest case, the
operations which manipulate the state of a single quantum bit. First of all, it should be noted that the states $|0\rangle$
and $|1\rangle$ can be understood as an orthonormal basis of the complex inner product space $\mathbf{C}^2$. It is customary to
associate the base states $|0\rangle$ and $|1\rangle$ with the standard basis vectors $(1,0)^t$ and $(0,1)^t$, respectively. Therefore,
a quantum bit in the state $a\,|0\rangle + b\,|1\rangle$ is represented by the state vector

\[ \begin{pmatrix} a \\ b \end{pmatrix} \in \mathbf{C}^2. \]

A deterministic operation has to realize a unitary evolution of the quantum state, following the rules of quantum
mechanics. In other words, a single quantum bit operation is given by a unitary operator $U\colon \mathbf{C}^2 \to \mathbf{C}^2$ acting
on the state of the quantum bit. There is a graphical notation for quantum operations. The schematic for such
a single qubit operation $U$ is shown in Figure 1.

Figure 1. Schematic for a single qubit operation. The diagrams are read from left to right, reflecting the abstraction of
time flow.

Examples of single qubit operations are given by

\[ X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}. \]

The operation $X$ realizes a NOT operation, $X|0\rangle = |1\rangle$ and $X|1\rangle = |0\rangle$. The operation $Z$ implements a phase
shift operation, $Z|0\rangle = |0\rangle$ and $Z|1\rangle = -|1\rangle$. An extremely useful operation is given by the Hadamard gate $H$,
which is for instance used to create superpositions.

The Hadamard gate should be familiar to readers with a background in signal processing or coding theory. In
the following, we will keep the notations for these gates without further notice.
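The single qubit operations above can be tried out numerically. The following is a minimal sketch in Python with NumPy, assuming the usual matrix forms of $X$, $Z$, and $H$; the helper names `is_unitary` and `measurement_probabilities` are hypothetical and only serve the illustration.

```python
import numpy as np

# Single qubit gates as 2x2 unitary matrices (basis order |0>, |1>).
X = np.array([[0, 1], [1, 0]], dtype=complex)                 # NOT
Z = np.array([[1, 0], [0, -1]], dtype=complex)                # phase shift
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)   # Hadamard

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)

def is_unitary(U):
    """Check that U^dagger U is the identity up to rounding error."""
    return np.allclose(U.conj().T @ U, np.eye(U.shape[0]))

def measurement_probabilities(state):
    """Probabilities of observing 0 and 1 for a (possibly unnormalized) qubit state."""
    weights = np.abs(state) ** 2
    return weights / weights.sum()

# The Hadamard gate turns |0> into an equal superposition of |0> and |1>,
# which is the fair coin flip discussed earlier.
plus = H @ ket0
probs = measurement_probabilities(plus)
```

Here `probs` holds the two outcome probabilities of the coin-flip state; dividing by the total weight mirrors the convention that proportional states give the same measurement statistics.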

The operations get more interesting in the case of multiple quantum bits. Quantum mechanics tells us that
the state space of a combined quantum system is given by the tensor product of the state spaces of its parts.
A remarkable consequence of this rule is that the state space of a system with $n$ quantum bits is given by the
vector space $\mathbf{C}^{2^n} = \mathbf{C}^2 \otimes \cdots \otimes \mathbf{C}^2$ ($n$-fold tensor product). This simply means that the dimension of the state
space doubles with the addition of a single quantum bit.

The state of a system with two quantum bits can thus be described by a vector $(a_{00}, a_{01}, a_{10}, a_{11})^t \in \mathbf{C}^4$ or,
isomorphically, by the vector

\[ |\psi\rangle = a_{00}\,|0\rangle \otimes |0\rangle + a_{01}\,|0\rangle \otimes |1\rangle + a_{10}\,|1\rangle \otimes |0\rangle + a_{11}\,|1\rangle \otimes |1\rangle \in \mathbf{C}^2 \otimes \mathbf{C}^2. \qquad (1) \]

The latter notation is often abbreviated to $a_{00}|00\rangle + a_{01}|01\rangle + a_{10}|10\rangle + a_{11}|11\rangle$. The label $x_1 x_0$ in the Dirac
ket notation $|x_1 x_0\rangle$ specifies a location in the quantum memory.

A dramatic consequence of the tensor product structure of the quantum memory can be illustrated with a
single qubit operation. Suppose that we apply a single qubit operation, say the Hadamard gate $H$, on the least
significant bit of (1). The resulting state is

\[ \begin{aligned} |\psi'\rangle &= a_{00}\,|0\rangle \otimes H|0\rangle + a_{01}\,|0\rangle \otimes H|1\rangle + a_{10}\,|1\rangle \otimes H|0\rangle + a_{11}\,|1\rangle \otimes H|1\rangle \\ &= \tfrac{1}{\sqrt{2}} \big( (a_{00} + a_{01})\,|0\rangle \otimes |0\rangle + (a_{00} - a_{01})\,|0\rangle \otimes |1\rangle + (a_{10} + a_{11})\,|1\rangle \otimes |0\rangle + (a_{10} - a_{11})\,|1\rangle \otimes |1\rangle \big). \end{aligned} \]

In more traditional mathematical notation, we can formulate this as the action of the matrix

\[ \mathbf{1}_2 \otimes H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 & & \\ 1 & -1 & & \\ & & 1 & 1 \\ & & 1 & -1 \end{pmatrix} \]

on the state vector $(a_{00}, a_{01}, a_{10}, a_{11})^t$. Therefore, the resulting operation is $(\mathbf{1}_2 \otimes H)\,|\psi\rangle = |\psi'\rangle$. Although we act only on one quantum bit, we see
that every single position of the state vector is manipulated. This is a striking example of quantum parallelism. We
observe that a butterfly structure, well known from many signal processing algorithms, can be implemented
with a single operation on a quantum computer.
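The butterfly structure can be seen directly in a small simulation. The sketch below, in Python with NumPy, applies $\mathbf{1}_2 \otimes H$ to a two-qubit state vector and compares the result with the two butterflies written out by hand; the amplitudes in `psi` are arbitrary test data, not values from the text.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
I2 = np.eye(2)

# State vector (a00, a01, a10, a11); the values are arbitrary test data.
psi = np.array([1.0, 2.0, 3.0, 4.0])

# Hadamard gate on the least significant qubit: the matrix 1_2 (x) H.
op = np.kron(I2, H)
psi_out = op @ psi

# The same result written out as two butterflies on neighbouring pairs.
a00, a01, a10, a11 = psi
expected = np.array([a00 + a01, a00 - a01, a10 + a11, a10 - a11]) / np.sqrt(2)
```

On a classical machine the Kronecker product touches all four amplitudes explicitly, which is exactly the work a quantum computer performs with one gate.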
Figure 2. Single qubit operation $U$. The left side shows the schematic for $U$ acting on the least significant qubit; this
circuit implements the matrix $\mathbf{1} \otimes U$. The figure on the right shows the single qubit gate acting on the most significant
qubit; this circuit implements the matrix $U \otimes \mathbf{1}$.

The direct generalization to arbitrary single qubit operations is shown in Figure 2. In general, a single
qubit operation is specified by a unitary $2 \times 2$ matrix $U$ and the position of the target qubit to which $U$ is
applied. Suppose that the target qubit position is $i$; then each state $|x_{n-1} \ldots x_{i+1} x_i x_{i-1} \ldots x_0\rangle$ is unconditionally
transformed to $|x_{n-1} \ldots x_{i+1}\rangle \otimes U|x_i\rangle \otimes |x_{i-1} \ldots x_0\rangle$, where $x_k \in \{0, 1\}$.

We can specify more elaborate gates, which allow us to create an interaction between quantum bits. Let $C_0$ and
$C_1$ be two disjoint sets of quantum bit positions, neither of which contains the target bit position $i$. A conditional
$U$-operation maps the state $|x_{n-1} \ldots x_{i+1} x_i x_{i-1} \ldots x_0\rangle$ to the state $|x_{n-1} \ldots x_{i+1}\rangle \otimes U|x_i\rangle \otimes |x_{i-1} \ldots x_0\rangle$, in
case $x_j = 0$ for all $j \in C_0$ and $x_j = 1$ for all $j \in C_1$. The state remains unchanged in all other cases. The set $C_0$
describes the set of zero-conditions and $C_1$ the set of one-conditions. In the schematics, we will use the symbol
o to denote a zero-condition and the symbol • to denote a one-condition. Figure 3 shows the simplest, but most
important, conditional quantum gate: the controlled NOT operation.

Figure 3. The controlled NOT gate is a reversible XOR gate. The states $|00\rangle$ and $|01\rangle$ remain unchanged, since the
gate acts only if the most significant qubit is 1. If the most significant bit is 1, then a NOT operation is applied to the least significant
bit. Therefore, $|10\rangle$ is mapped to $|11\rangle$, and $|11\rangle$ is mapped to $|10\rangle$.
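The definition of a conditional gate translates almost literally into a matrix construction. The following Python sketch builds the $2^n \times 2^n$ matrix of a conditional $U$ gate from the target position and the sets of zero- and one-conditions; the function name `conditional_gate` and its parameters are hypothetical, and qubit 0 is assumed to be the least significant one.

```python
import numpy as np

def conditional_gate(n, target, U, zero_controls=(), one_controls=()):
    """Matrix of a conditional U gate on n qubits (qubit 0 is least significant)."""
    dim = 2 ** n
    M = np.zeros((dim, dim), dtype=complex)
    for x in range(dim):
        bit = lambda k: (x >> k) & 1
        active = (all(bit(j) == 0 for j in zero_controls)
                  and all(bit(j) == 1 for j in one_controls))
        if not active:
            M[x, x] = 1                    # the state is left unchanged
            continue
        cleared = x & ~(1 << target)       # x with the target bit set to 0
        for v in (0, 1):
            # The amplitude <v| U |x_target> goes to the state with target bit v.
            M[cleared | (v << target), x] = U[v, bit(target)]
    return M

X = np.array([[0, 1], [1, 0]])

# Controlled NOT of Figure 3: one-condition on qubit 1, target qubit 0.
CNOT = conditional_gate(2, target=0, U=X, one_controls=(1,))
```

With empty condition sets the function reduces to an unconditional single qubit gate, so `conditional_gate(2, 0, U)` reproduces $\mathbf{1} \otimes U$.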



We can use controlled NOT gates to get an interaction between different quantum bits. For example, consider 
the circuit in Figure 4. This circuit swaps the states of the two quantum bits. 
Figure 4: Circuit which swaps the state of two quantum bits.
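A swap of two quantum bits can be checked with a few matrix products. The sketch below assumes that the swap circuit is the usual construction from three controlled NOT gates with alternating control and target; the figure itself is not reproduced here, so this is an assumption about its content.

```python
import numpy as np

# CNOT with control on the most significant qubit (basis order |00>, |01>, |10>, |11>).
CNOT_hi = np.array([[1, 0, 0, 0],
                    [0, 1, 0, 0],
                    [0, 0, 0, 1],
                    [0, 0, 1, 0]])

# CNOT with control on the least significant qubit.
CNOT_lo = np.array([[1, 0, 0, 0],
                    [0, 0, 0, 1],
                    [0, 0, 1, 0],
                    [0, 1, 0, 0]])

# Three controlled NOT gates in a row exchange the two qubits.
SWAP = CNOT_hi @ CNOT_lo @ CNOT_hi

# Arbitrary product state |u> (x) |v> used as test data.
u = np.array([0.6, 0.8])
v = np.array([1.0, 0.0])
swapped = SWAP @ np.kron(u, v)
```

The vector `swapped` equals `np.kron(v, u)`, that is, the two single qubit states have changed places.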



Engineering controlled $U$ operations is in general a difficult task. We will refer to controlled NOT gates
with a single control bit and to single qubit operations as elementary gates. Elementary quantum gates are
available in all mature quantum computing technologies. It can be shown that it is possible to implement a
general controlled $U$ operation with $O(\log N)$ elementary gates. We will always refer to elementary gates in
gate counts, but we will use multiply controlled $U$ gates for the sake of brevity in circuit descriptions. There
exist standard algorithms which transform these more general gates into a sequence of elementary gates.



3. DIVIDE-AND-CONQUER METHODS

We have seen that a number of powerful operations are available on a quantum computer. Suppose that we
want to implement a unitary or orthogonal transform $U \in \mathcal{U}(2^n)$ on a quantum computer. The goal will be
to find an implementation of $U$ in terms of elementary quantum gates. Usually, our aim will be to find first a
factorization of $U$ in terms of sparse structured unitary matrices $U_i$,

\[ U = U_1 U_2 \cdots U_k, \]

where, of course, $k$ should be small. The philosophy is that it is often very easy to derive quantum circuits
for structured sparse matrices. For example, if we can find an implementation with few multiply controlled
unitary gates for each factor $U_i$, then the overall circuit will be extremely efficient.

The success of this method depends of course very much on the availability of a suitable factorization of $U$.
However, in the case of the orthogonal transforms used in signal processing, there are typically numerous classical
algorithms available, which provide suitable factorizations. It should be noted that, in principle, an exponential
number of elementary gates might be needed to implement even a diagonal unitary matrix. Fortunately,
we will see that most structured matrices occurring in practice have very efficient implementations. In fact,
we will see that all the transforms of size $2^n \times 2^n$ discussed in the following can be implemented with merely
$O(\log^2 2^n) = O(n^2)$ elementary quantum gates.

We present a simple, but novel, approach to derive such efficient implementations. This approach is based
on a divide-and-conquer technique. Assume that we want to implement a family of unitary transforms $U_N$,
where $N = 2^n$ denotes the length of the signal. Suppose further that the family $U_N$ can be generated
by a recursive circuit construction, for instance, such as the one shown in Figure 5. We will give a generic
construction for the family of precomputation circuits $\mathrm{Pre}_{N/2}$ and the family of postcomputation circuits $\mathrm{Post}_{N/2}$.
This way, we obtain a fairly economic description of the algorithms.

Figure 5. Recursive implementation of a family of quantum circuits $U_N$. If the precomputation circuits $\mathrm{Pre}_{N/2}$ and
postcomputation circuits $\mathrm{Post}_{N/2}$ have small complexity, then the overall circuit family will have an efficient implementation.

Assume that a total of $P(N)$ elementary operations are necessary to implement the precomputation circuit
$\mathrm{Pre}_{N/2}$ and the postcomputation circuit $\mathrm{Post}_{N/2}$. Then the overall number $T(N)$ of elementary operations
satisfies the recurrence equation

\[ T(N) = T(N/2) + P(N). \]

The number of operations $T(N)$ for the recursive implementation can be estimated as follows:
Lemma. If $P(N) \in \Theta(\log^p N)$, then $T(N) \in O(\log^{p+1} N)$.
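The estimate in the Lemma follows from unrolling the recurrence: for $N = 2^n$ one obtains $T(2^n) = T(1) + \sum_{k=1}^{n} P(2^k)$, and a sum of $n$ terms, each bounded by a constant times $n^p$, is bounded by a constant times $n^{p+1}$. The following Python sketch illustrates this for the model cost $P(N) = (\log_2 N)^p$ with $T(1) = 0$; both choices are assumptions made only for the illustration, as are the function names.

```python
def gate_count(n, p):
    """T(2^n) for the recurrence T(N) = T(N/2) + (log2 N)^p with T(1) = 0."""
    if n == 0:
        return 0
    return gate_count(n - 1, p) + n ** p

def bound(n, p):
    """The value (log2 N)^(p+1) for N = 2^n."""
    return n ** (p + 1)

# Ratio T(N) / log^(p+1) N for p = 1 and N = 2, 4, ..., 1024.
ratios = [gate_count(n, 1) / bound(n, 1) for n in range(1, 11)]
```

For $p = 1$ the count is $n(n+1)/2$, so the ratios never exceed 1 and tend to 1/2, in line with a quadratic bound in $\log N$.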



4. FOURIER TRANSFORM 

We will illustrate the general approach by way of some examples. Our first example is the discrete Fourier
transform. A quantum algorithm implementing this transform found a most famous application in Shor's
integer factorization algorithm. Recall that the discrete Fourier transform $F_N$ of length $N = 2^n$ can be
described by the matrix

\[ F_N = \frac{1}{\sqrt{N}} \left( \omega^{jk} \right)_{j,k = 0, \ldots, N-1}, \]

where $\omega$ denotes a primitive $N$-th root of unity, $\omega = \exp(2\pi i/N)$, and $i$ denotes a square root of $-1$.

The main observation behind the fast quantum algorithm dates back at least to work by Danielson and
Lanczos in 1942 (and is implicitly contained in numerous earlier works). They noticed that the matrix $F_N$
might be written as

\[ F_N = \frac{1}{\sqrt{2}}\, P_N \begin{pmatrix} F_{N/2} & F_{N/2} \\ F_{N/2} T_{N/2} & -F_{N/2} T_{N/2} \end{pmatrix}, \]

where $P_N$ denotes the permutation of rows given by $P_N |bx\rangle = |xb\rangle$ with $x$ an $(n-1)$-bit integer and $b$ a single
bit, and $T_{N/2} := \mathrm{diag}(1, \omega, \omega^2, \ldots, \omega^{N/2-1})$ denotes the matrix of twiddle factors.



This observation allows us to represent $F_N$ by the following product of matrices:

\[ F_N = P_N \, (\mathbf{1}_2 \otimes F_{N/2}) \, (\mathbf{1}_{N/2} \oplus T_{N/2}) \, (H \otimes \mathbf{1}_{N/2}). \]

This factorization yields an outline of an implementation on a quantum computer. The overall structure is
shown in Figure 6.

Figure 6: The recursive structure of the quantum Fourier transform.



It remains to detail the different steps in this implementation. The first step is a single qubit operation,
implementing a butterfly structure. The next step is slightly more complicated. We observe that $T_{N/2}$ is a
tensor product of diagonal matrices $D_j = \mathrm{diag}(1, \omega^{2^{j-1}})$. Indeed,

\[ T_{N/2} = D_{n-1} \otimes \cdots \otimes D_2 \otimes D_1. \]

Thus, $\mathbf{1}_{N/2} \oplus T_{N/2}$ can be realized by controlled phase shift operations, see Figure 7 for an example. We then
recurse to implement the Fourier transform of smaller size. The final permutation implements the cyclic rotation
of the quantum wires.
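The recursion step for the Fourier transform can be verified numerically for small lengths. The Python sketch below builds each factor as an explicit matrix, assuming the product form $P_N (\mathbf{1}_2 \otimes F_{N/2}) (\mathbf{1}_{N/2} \oplus T_{N/2}) (H \otimes \mathbf{1}_{N/2})$ and the convention $\omega = \exp(2\pi i/N)$; all function names are hypothetical.

```python
import numpy as np

def dft(N):
    """Unitary DFT matrix with entries omega^(j*k) / sqrt(N), omega = exp(2*pi*i/N)."""
    k = np.arange(N)
    return np.exp(2j * np.pi * np.outer(k, k) / N) / np.sqrt(N)

def rotation_perm(N):
    """Permutation matrix P_N with P_N |b x> = |x b> (b is the top bit)."""
    P = np.zeros((N, N))
    for b in (0, 1):
        for x in range(N // 2):
            P[2 * x + b, b * (N // 2) + x] = 1
    return P

def twiddle(N):
    """T_{N/2} = diag(1, omega, ..., omega^(N/2 - 1))."""
    return np.diag(np.exp(2j * np.pi * np.arange(N // 2) / N))

def twiddle_tensor(N):
    """T_{N/2} as the tensor product D_{n-1} (x) ... (x) D_1."""
    n = N.bit_length() - 1
    omega = np.exp(2j * np.pi / N)
    T = np.eye(1)
    for j in range(n - 1, 0, -1):
        T = np.kron(T, np.diag([1, omega ** (2 ** (j - 1))]))
    return T

def dft_factorized(N):
    """One recursion step: P_N (1_2 (x) F_{N/2}) (1 (+) T_{N/2}) (H (x) 1)."""
    half = N // 2
    H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
    butterfly = np.kron(H, np.eye(half))
    twiddles = np.block([[np.eye(half), np.zeros((half, half))],
                         [np.zeros((half, half)), twiddle(N)]])
    recursion = np.kron(np.eye(2), dft(half))
    return rotation_perm(N) @ recursion @ twiddles @ butterfly
```

Comparing `dft_factorized(N)` with `dft(N)` for a few powers of two confirms the step, and `twiddle_tensor(N)` agrees with `twiddle(N)`, which is the property that lets the twiddle factors be realized by one controlled phase shift per qubit.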




Figure 7: Implementation of the twiddle matrix $\mathbf{1}_8 \oplus T_8$.



The complexity of the quantum Fourier transform can be estimated as follows. If we denote by $R(N)$ the
number of gates necessary to implement the DFT of length $N = 2^n$ on a quantum computer, then Figure 6
implies the recurrence relation

\[ R(N) = R(N/2) + \Theta(\log N), \]

which leads to the estimate $R(N) = O(\log^2 N)$.

It should be noted that all permutations $P_N (\mathbf{1}_2 \otimes P_{N/2}) \cdots (\mathbf{1}_{N/4} \otimes P_4)$ at the end can be combined into a
single permutation of quantum wires. The resulting permutation is the bit reversal, see Figure 8.
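That the combined permutation is the bit reversal can be checked on the level of basis state labels. The Python sketch below treats each $P_{2^m}$ as a cyclic rotation of the lowest $m$ bits, following $P_N|bx\rangle = |xb\rangle$, and applies the factors from the smallest to the largest; the function names are hypothetical.

```python
def rotate(x, bits):
    """Index map of P_{2^bits}: |b x'> -> |x' b>, a cyclic left rotation of the bits."""
    top = x >> (bits - 1)
    rest = x & ((1 << (bits - 1)) - 1)
    return (rest << 1) | top

def combined_permutation(x, n):
    """Apply (1 (x) P_2), ..., (1_2 (x) P_{N/2}), P_N in this order to the index x."""
    for bits in range(1, n + 1):
        high = x >> bits                  # upper bits are left untouched
        low = x & ((1 << bits) - 1)       # lowest `bits` bits are rotated
        x = (high << bits) | rotate(low, bits)
    return x

def bit_reverse(x, n):
    """Reverse the n-bit binary representation of x."""
    return int(format(x, f"0{n}b")[::-1], 2)
```

After the factor acting on the lowest $m$ bits has been applied, those $m$ bits appear in reversed order, so the last factor $P_N$ completes the reversal of all $n$ bits.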

Remark. Another explanation of the discrete Fourier transform algorithm is contained in [17]. Note that the
row permutations are mistaken in that article. An approximate version of the discrete Fourier transform, which
saves some operations, has been proposed by Coppersmith.



Figure 8: The bit reversal permutation resulting from $P_8 (\mathbf{1}_2 \otimes P_4)(\mathbf{1}_4 \otimes P_2)$.



5. THE WALSH-HADAMARD TRANSFORM

The Walsh-Hadamard transform $W_N$ is maybe the simplest instance of the recursive approach. This transform
is defined by the Hadamard gate $W_2 = H$ in the case of signals of length 2. For signals of larger length, the
transform is defined by

\[ W_N = (\mathbf{1}_2 \otimes W_{N/2})(H \otimes \mathbf{1}_{N/2}). \]

This yields the recursive implementation shown in Figure 9.

Figure 9: Recursive implementation of the Walsh-Hadamard transform.

Since $P(N) = \Theta(1)$, the Lemma in Section 3 shows that the number of operations satisfies $T(N) \in O(\log N)$. It is
of course trivial to see that in this case exactly $\log N$ operations are needed.
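The recursion for the Walsh-Hadamard transform is short enough to write down directly. The Python sketch below builds $W_N$ from the recursive definition and compares it with one Hadamard gate applied to each of the $\log N$ qubits; the function names are hypothetical.

```python
import numpy as np
from functools import reduce

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)

def walsh_hadamard(N):
    """W_N from the recursion W_N = (1_2 (x) W_{N/2}) (H (x) 1_{N/2}), W_2 = H."""
    if N == 2:
        return H
    half = N // 2
    return np.kron(np.eye(2), walsh_hadamard(half)) @ np.kron(H, np.eye(half))

def hadamard_on_every_qubit(n):
    """The n-fold tensor product H (x) ... (x) H, one Hadamard gate per qubit."""
    return reduce(np.kron, [H] * n)
```

Both constructions give the same matrix, which makes the count of exactly $\log N$ single qubit gates explicit.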

6. THE SLANT TRANSFORM 

The Slant transform is used in image processing for the representation of images with many constant or uniformly
changing gray levels. The transform has good energy compaction properties. It is used in Intel's 'Indeo' video
compression and in numerous still image compression algorithms.

The Slant transform $S_N$ is defined for signals of length $N = 2$ by the Hadamard matrix

\[ S_2 = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \]

and for signals of length $N = 2^n$, $N > 2$, by

\[ S_N = Q_N \begin{pmatrix} S_{N/2} & 0_{N/2} \\ 0_{N/2} & S_{N/2} \end{pmatrix}, \qquad (2) \]

where $0_{N/2}$ denotes the all-zero matrix, and $Q_N$ is given by the matrix product

\[ Q_N = P^a_N \, (\mathbf{1}_{N/2} \oplus \tilde{Q}_N) \, (H \otimes \mathbf{1}_{N/2}) \, P^b_N. \qquad (3) \]

The matrices in (3) are defined as follows (see also [19]): $\mathbf{1}_{N/2}$ is the identity matrix, $H$ is the Hadamard matrix,
and $P^a_N$ realizes the transposition $(1, N/2)$, that is,

\[ P^a_N |1\rangle = |N/2\rangle, \qquad P^a_N |N/2\rangle = |1\rangle, \qquad \text{and } P^a_N |x\rangle = |x\rangle \text{ otherwise.} \]

The matrix $P^b_N$ is defined by $P^b_N |x\rangle = |x\rangle$ for all $x$ except in the case $x = N/2 + 1$, where it yields the phase
change $P^b_N |N/2 + 1\rangle = -|N/2 + 1\rangle$. Finally,

\[ \tilde{Q}_N = \begin{pmatrix} a_N & b_N \\ -b_N & a_N \end{pmatrix} \oplus \mathbf{1}_{N/2 - 2}, \]

where $a_N$ and $b_N$ are recursively defined by $a_2 = 1$ and

\[ b_N = \frac{1}{\sqrt{1 + 4 (a_{N/2})^2}} \qquad \text{and} \qquad a_N = 2\, b_N\, a_{N/2}. \]

It is easy to check that $\tilde{Q}_N$ is a unitary matrix.

The definition of the Slant transform suggests the following implementation. Equation (2) tells us that the
input signal of a Slant transform of length $N$ is first processed by two Slant transforms of size $N/2$, followed by
a circuit implementing $Q_N$. We can write equation (2) in the form

\[ S_N = Q_N \begin{pmatrix} S_{N/2} & 0_{N/2} \\ 0_{N/2} & S_{N/2} \end{pmatrix} = Q_N (\mathbf{1}_2 \otimes S_{N/2}). \]

The tensor product structure $\mathbf{1}_2 \otimes S_{N/2}$ is compatible with our decomposition into quantum bits. This means
that a single copy of the circuit $S_{N/2}$ acting on the lower significant bits will realize this part. It remains to give
an implementation for $Q_N$. Equation (3) describes $Q_N$ as a product of four sparse matrices, which are easy to
implement. Indeed, the matrix $P^b_N$ is realized by conditionally applying the phase gate $Z$. The matrix $H \otimes \mathbf{1}_{N/2}$
is implemented by a Hadamard gate $H$ acting on the most significant bit. A conditional application of the
$2 \times 2$ rotation block of $\tilde{Q}_N$ implements the matrix $\mathbf{1}_{N/2} \oplus \tilde{Q}_N$. A conditional swap of the least and the most
significant qubit realizes $P^a_N$, that is, three multiply controlled NOT gates implement $P^a_N$. The quantum circuit
realizing this implementation is depicted in Figure 10.

Figure 10. Implementation of the Slant transform. The recursive step is realized by a single Slant transform of size
$N/2$. The next three gates implement $P^b_N$, $H \otimes \mathbf{1}_{N/2}$, and $\mathbf{1}_{N/2} \oplus \tilde{Q}_N$, respectively. The last three gates implement $P^a_N$.
Thus, the implementation of $Q_N$ totals five multiply controlled gates and one single qubit gate.

Theorem 6.1. The Slant transform of length $N = 2^n$ can be realized on a quantum computer with at most
$O(\log^2 N)$ elementary operations (that is, controlled NOT gates and single qubit gates), assuming that additional
workbits are available.

Proof. Recall that a multiply controlled gate can be expressed with at most $O(\log N)$ elementary operations
as long as additional workbits are available. It follows from the Lemma in Section 3 that at most $O(\log^2 N)$
elementary operations are needed to implement the Slant transform. □
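The recursive definition of the Slant transform can be turned into a small numerical check. The Python sketch below assembles $S_N = Q_N(\mathbf{1}_2 \otimes S_{N/2})$ from the four sparse factors of $Q_N$; it assumes that the $2 \times 2$ block of $\tilde{Q}_N$ has rows $(a_N, b_N)$ and $(-b_N, a_N)$, which is the choice that reproduces the classical Slant matrix of length 4, and all function names are hypothetical.

```python
import numpy as np

def slant_coefficients(N):
    """Return (a_N, b_N) from a_2 = 1, b_N = 1/sqrt(1 + 4 a_{N/2}^2), a_N = 2 b_N a_{N/2}."""
    if N == 2:
        return 1.0, None
    a_half, _ = slant_coefficients(N // 2)
    b = 1.0 / np.sqrt(1.0 + 4.0 * a_half ** 2)
    return 2.0 * b * a_half, b

def slant(N):
    """Slant matrix S_N built from S_N = Q_N (1_2 (x) S_{N/2})."""
    H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
    if N == 2:
        return H
    half = N // 2
    a, b = slant_coefficients(N)

    Pb = np.eye(N)
    Pb[half + 1, half + 1] = -1            # phase change on |N/2 + 1>

    Q_tilde = np.eye(half)
    Q_tilde[:2, :2] = [[a, b], [-b, a]]    # assumed form of the 2x2 block
    lower = np.eye(N)
    lower[half:, half:] = Q_tilde          # 1_{N/2} (+) Q~_N

    Pa = np.eye(N)
    Pa[[1, half]] = Pa[[half, 1]]          # transposition (1, N/2)

    Q = Pa @ lower @ np.kron(H, np.eye(half)) @ Pb
    return Q @ np.kron(np.eye(2), slant(half))
```

The second row of `slant(N)` decreases linearly, which is the "slant" basis vector that gives the transform its name, and every `slant(N)` is an orthogonal matrix.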



7. THE HARTLEY TRANSFORM 

The discrete Hartley transform $H_N$ is defined for signals of length $N = 2^n$ by the matrix

\[ H_N = \frac{1}{\sqrt{N}} \left( \cos(2\pi k\ell/N) + \sin(2\pi k\ell/N) \right)_{k,\ell = 0, \ldots, N-1}. \]

The discrete Hartley transform is very popular in classical signal processing, since it requires only real arithmetic
but has properties similar to those of the Fourier transform. In particular, there are classical algorithms available
which outperform the fastest Fourier transform algorithms. We derive a fast quantum algorithm for this transform,
again based on a recursive divide-and-conquer algorithm. A fast algorithm for the discrete Hartley transform based
on a completely different approach has been discussed by Klappenecker and Rötteler.

The Hartley transform can be recursively represented as

\[ H_N = \frac{1}{\sqrt{2}} \begin{pmatrix} \mathbf{1}_{N/2} & BC_{N/2} \\ \mathbf{1}_{N/2} & -BC_{N/2} \end{pmatrix} \begin{pmatrix} H_{N/2} & \\ & H_{N/2} \end{pmatrix} Q_N, \qquad (4) \]

where $Q_N$ is the permutation $Q_N |xb\rangle = |bx\rangle$, with $b$ a single bit, separating the even indexed samples and the
odd indexed samples; for instance, $Q_8 (x_0, x_1, x_2, x_3, x_4, x_5, x_6, x_7)^t = (x_0, x_2, x_4, x_6, x_1, x_3, x_5, x_7)^t$. The matrix
$BC_{N/2}$ is given by

\[ (BC_{N/2})_{k,\ell} = \cos(2\pi k/N)\, \delta_{k,\ell} + \sin(2\pi k/N)\, \delta_{k+\ell \equiv 0 \ (\mathrm{mod}\ N/2)}, \qquad k, \ell = 0, \ldots, N/2 - 1. \]

The equation (4) leads to the implementation sketched in Figure 11.

Figure 11: Recursive implementation of the Hartley transform.

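It remains to describe the implementation of $BC_{N/2}$. It will be instructive to detail the action of the matrix
$BC_{N/2}$ on a state vector of $n-1$ qubits. We will need a few notations first. Denote by $|bx\rangle$ a state vector of
$n-1$ qubits, where $b$ denotes a single bit and $x$ an $(n-2)$-bit integer. We denote by $x'$ the two's complement of
$x$. We mean by $\mathbf{0}$ the number 0 and by $\mathbf{1}$ the number $2^{n-2} - 1$, that is, $\mathbf{1}$ has all bits set and $\mathbf{0}$ has no bit
set. Then the action of $BC_{N/2}$ on $|bx\rangle$ is given, for $y \neq \mathbf{0}$, by

\[ \begin{aligned} BC_{N/2}\, |0\mathbf{0}\rangle &= |0\mathbf{0}\rangle, & BC_{N/2}\, |0y\rangle &= c_N^{y}\, |0y\rangle + s_N^{y}\, |1y'\rangle, \\ BC_{N/2}\, |1\mathbf{0}\rangle &= |1\mathbf{0}\rangle, & BC_{N/2}\, |1y\rangle &= s_N^{y'}\, |0y'\rangle - c_N^{y'}\, |1y\rangle, \end{aligned} \]

where $s_N^k = \sin(2\pi k/N)$ and $c_N^k = \cos(2\pi k/N)$.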
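The listed action of $BC_{N/2}$ can be checked against the definition of the Hartley transform. The Python sketch below assembles $BC_{N/2}$ column by column from that action and verifies one recursion step; it assumes the step has the form $(H \otimes \mathbf{1}_{N/2})(\mathbf{1}_{N/2} \oplus BC_{N/2})(\mathbf{1}_2 \otimes H_{N/2}) Q_N$ with $H$ the Hadamard gate, which is the standard decimation-in-time recursion, and all function names are hypothetical.

```python
import numpy as np

def hartley(N):
    """Discrete Hartley matrix with entries (cos + sin)(2*pi*k*l/N) / sqrt(N)."""
    k = np.arange(N)
    angle = 2 * np.pi * np.outer(k, k) / N
    return (np.cos(angle) + np.sin(angle)) / np.sqrt(N)

def bc_matrix(N):
    """BC_{N/2} assembled column by column from its action on the states |b x>."""
    half, quarter = N // 2, N // 4
    c = lambda k: np.cos(2 * np.pi * k / N)
    s = lambda k: np.sin(2 * np.pi * k / N)
    BC = np.zeros((half, half))
    BC[0, 0] = 1                       # |0 0> -> |0 0>
    BC[quarter, quarter] = 1           # |1 0> -> |1 0>
    for y in range(1, quarter):
        yc = quarter - y               # two's complement of y on n-2 bits
        BC[y, y] = c(y)                # |0 y> -> c^y |0 y> + s^y |1 y'>
        BC[quarter + yc, y] = s(y)
        BC[yc, quarter + y] = s(yc)    # |1 y> -> s^y' |0 y'> - c^y' |1 y>
        BC[quarter + y, quarter + y] = -c(yc)
    return BC

def even_odd_perm(N):
    """Q_N with Q_N |x b> = |b x>: even indexed samples first, then the odd ones."""
    Q = np.zeros((N, N))
    for x in range(N // 2):
        for b in (0, 1):
            Q[b * (N // 2) + x, 2 * x + b] = 1
    return Q

def hartley_factorized(N):
    """One recursion step for N >= 4, using H_{N/2} as a black box."""
    half = N // 2
    Hd = np.array([[1, 1], [1, -1]]) / np.sqrt(2)     # Hadamard gate
    cond_bc = np.eye(N)
    cond_bc[half:, half:] = bc_matrix(N)              # 1_{N/2} (+) BC_{N/2}
    return (np.kron(Hd, np.eye(half)) @ cond_bc
            @ np.kron(np.eye(2), hartley(half)) @ even_odd_perm(N))
```

For small powers of two, `hartley_factorized(N)` agrees with `hartley(N)` up to rounding error, so the four cases of the action are consistent with the transform they are meant to implement.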

We are now in the position to describe the implementation of $BC_{N/2}$ shown in Figure 12. In the first step,
the least $n-2$ qubits are conditionally mapped to their two's complement. More precisely, the input signal $|bx\rangle$
is mapped to $|bx'\rangle$ if $b = 1$, and does not change otherwise. Thus, the circuit TC implements the involutory
permutation corresponding to the two's complement operation. This can be done with $O(n)$ elementary gates,
provided that sufficient workspace is available. In the next step, a sign change is done if $b = 1$, that is,
$|1x\rangle \mapsto -|1x\rangle$, unless the input $x$ was equal to zero, $|1\mathbf{0}\rangle \mapsto |1\mathbf{0}\rangle$. The next step is a conditioned cascade of
rotations. The least significant bits determine the angle of the rotation on the most significant of the $n-1$ qubits.
The $k$-th qubit exerts a rotation

\[ R_{2^k} = \begin{pmatrix} \cos(2\pi 2^k/N) & -\sin(2\pi 2^k/N) \\ \sin(2\pi 2^k/N) & \cos(2\pi 2^k/N) \end{pmatrix} \]

on the most significant qubit. Finally, another two's complement circuit is conditionally applied to the state.

Figure 12: Implementation of the matrix $BC_{N/2}$.

One readily checks that the implementation indeed maps $|0\mathbf{0}\rangle$ to $|0\mathbf{0}\rangle$ and $|1\mathbf{0}\rangle$ to $|1\mathbf{0}\rangle$. The
input $|0x\rangle$ is mapped to $c_N^{x}\,|0x\rangle + s_N^{x}\,|1x'\rangle$, as desired. Assume that the input is $|1x\rangle$ with $x \neq \mathbf{0}$. Then
the state is changed to $|1x'\rangle$ by the circuit TC, and after that its sign is changed, which yields $-|1x'\rangle$. The
rotations map this state to $s_N^{x'}\,|0x'\rangle - c_N^{x'}\,|1x'\rangle$. The final conditional two's complement operation yields the
state $s_N^{x'}\,|0x'\rangle - c_N^{x'}\,|1x\rangle$, which is exactly what we want.
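The four steps of the circuit can also be multiplied out as matrices and compared with $BC_{N/2}$. The Python sketch below does this for $N \geq 4$; it models the cascade of rotations as a single rotation of the top qubit by the angle $2\pi x/N$, which is the product of the rotations $R_{2^k}$ selected by the set bits of $x$, and it takes as reference the matrix with $\cos(2\pi k/N)$ on the diagonal and $\sin(2\pi k/N)$ in column $-k \bmod N/2$. Both function names are hypothetical.

```python
import numpy as np

def bc_reference(N):
    """BC_{N/2}: cos(2 pi k/N) on the diagonal, sin(2 pi k/N) in column -k mod N/2."""
    half = N // 2
    BC = np.zeros((half, half))
    for k in range(half):
        BC[k, k] += np.cos(2 * np.pi * k / N)
        BC[k, (-k) % half] += np.sin(2 * np.pi * k / N)
    return BC

def bc_circuit(N):
    """Product of the four circuit steps acting on n-1 qubits (requires N >= 4)."""
    half, quarter = N // 2, N // 4

    # Steps 1 and 4: two's complement of x, conditioned on b = 1.
    TC = np.zeros((half, half))
    for x in range(quarter):
        TC[x, x] = 1
        TC[quarter + (-x) % quarter, quarter + x] = 1

    # Step 2: sign change on |1 x> for x != 0.
    sign = np.ones(half)
    sign[quarter + 1:] = -1
    Z = np.diag(sign)

    # Step 3: rotation of the top qubit by the angle 2 pi x / N.
    R = np.zeros((half, half))
    for x in range(quarter):
        c, s = np.cos(2 * np.pi * x / N), np.sin(2 * np.pi * x / N)
        R[x, x], R[x, quarter + x] = c, -s
        R[quarter + x, x], R[quarter + x, quarter + x] = s, c

    # The rightmost factor acts first.
    return TC @ R @ Z @ TC
```

The exception for $x = 0$ in the sign change matters: without it the state $|1\mathbf{0}\rangle$ would pick up a factor $-1$ and the product would no longer match the reference.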

The initial permutation, the circuit $BC_{N/2}$, and the Hadamard gate in Figure 11 can be implemented with
$O(\log N)$ elementary gates. It is crucial that additional workbits are available; otherwise the complexity will
increase to $\Theta(\log^2 N)$. The Lemma in Section 3 then completes the proof of the following theorem:

Theorem 7.1. There exists a recursive implementation of the discrete Hartley transform $H_N$ on a quantum
computer with $O(\log^2 N)$ elementary gates (that is, controlled NOT gates and single qubit gates), assuming that
additional workbits are available.

8. CONCLUSIONS 

We have presented a new approach to the design of quantum algorithms. The method takes advantage of a
divide-and-conquer approach. We have illustrated the method in the design of quantum algorithms for the
Fourier, Walsh, Slant, and Hartley transforms. The same method can be applied to derive fast algorithms for 
various discrete Cosine transforms. It might seem surprising that divide-and-conquer methods have not been 
previously suggested in quantum computing (to the best of our knowledge). One reason might be that the 
quantum circuit model implements only straight-line programs. We defined recursions on top of that model, 
similar to macro expansions in many classical programming languages. The benefit is that many circuits can 
be specified in a very lucid way. 

It should be emphasized that our divide-and-conquer approach is completely general. It can be applied to 
a much larger class of circuits, and is of course not restricted to signal processing applications. Moreover, it 
should be emphasized that many variations of this method are possible. We would like to encourage the reader 
to work out a few examples - quite often this is a simple exercise. 